CoMoE: Collaborative Optimization of Expert Aggregation and Offloading for MoE-based LLMs at Edge

Li, Muqing, Li, Ning, Yuan, Xin, Xu, Wenchao, Chen, Quan, Guo, Song, Zhang, Haijun

arXiv.org Artificial Intelligence

The proliferation of large language models (LLMs) has driven the adoption of Mixture-of-Experts (MoE) architectures as a promising solution to scale model capacity while controlling computational costs. However, deploying MoE models in resource-constrained mobile edge computing environments presents significant challenges due to their large memory footprint and dynamic expert activation patterns. To address these challenges, we propose CoMoE, a novel dynamic resource-aware collaborative optimization framework that jointly optimizes expert aggregation granularity and offloading strategies based on real-time device resource states, network conditions, and input characteristics in mobile edge environments. In CoMoE, we first systematically analyze existing expert aggregation techniques, including expert parameter merging, knowledge distillation, and parameter sharing decomposition, identifying their limitations in dynamic mobile environments. We then investigate expert offloading strategies encompassing expert prediction and prefetching, expert caching and scheduling, and multi-tier storage architectures, revealing the interdependencies between routing decisions and offloading performance. CoMoE incorporates adaptive scheduling mechanisms that respond to user mobility and varying network conditions, enabling efficient MoE deployment across heterogeneous edge devices. Extensive experiments on real mobile edge testbeds demonstrate that CoMoE achieves approximately a 70% reduction in memory usage compared to baseline methods and 10.5% lower inference latency than existing expert offloading techniques, while maintaining stable model performance. For large-scale MoE models (e.g., the 7.4B-parameter Switch-Base-128), CoMoE reduces memory requirements from 15.6GB to 4.7GB, enabling deployment on resource-constrained mobile edge devices that previously could only support much smaller models.
With the rapid advancement of artificial intelligence technology, Large Language Models (LLMs) have demonstrated unprecedented capabilities in natural language processing, computer vision, and other domains. However, as model scales continue to expand, computational efficiency and memory constraints have become critical challenges in practical model deployment. The Mixture of Experts (MoE) architecture emerges as a promising solution that effectively scales the model capacity while controlling computational costs through sparse activation mechanisms.
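The sparse activation the abstract refers to means each token activates only a few experts, and an edge deployment must then decide which of those experts are already resident in device memory. A minimal sketch of that interaction, with all names and the slot-budget policy being illustrative assumptions rather than CoMoE's actual algorithm:

```python
def route_tokens(gate_scores, k=2):
    """Top-k sparse activation: pick the k experts with the highest gating scores."""
    ranked = sorted(range(len(gate_scores)), key=lambda i: gate_scores[i], reverse=True)
    return ranked[:k]

def plan_offloading(needed_experts, cached_experts, memory_slots):
    """Split the routed experts into: run locally (already cached), fetch into
    free memory slots, or offload elsewhere when no slot remains."""
    local = [e for e in needed_experts if e in cached_experts]
    missing = [e for e in needed_experts if e not in cached_experts]
    free = memory_slots - len(cached_experts)
    fetch = missing[:max(free, 0)]
    offload = missing[len(fetch):]
    return local, fetch, offload
```

The point of the sketch is the coupling the abstract highlights: the routing decision (`route_tokens`) directly determines the offloading workload (`plan_offloading`), which is why the two are optimized jointly.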


Jupiter: Fast and Resource-Efficient Collaborative Inference of Generative LLMs on Edge Devices

Ye, Shengyuan, Ouyang, Bei, Zeng, Liekang, Qian, Tianyi, Chu, Xiaowen, Tang, Jian, Chen, Xu

arXiv.org Artificial Intelligence

Generative large language models (LLMs) have garnered significant attention due to their exceptional capabilities in various AI tasks. Traditionally deployed in cloud datacenters, LLMs are now increasingly moving towards more accessible edge platforms to protect sensitive user data and ensure privacy preservation. The limited computational resources of individual edge devices, however, can result in excessively prolonged inference latency and overwhelming memory usage. While existing research has explored collaborative edge computing to break the resource wall of individual devices, these solutions still suffer from massive communication overhead and under-utilization of edge resources. Furthermore, they focus exclusively on optimizing the prefill phase, neglecting the crucial autoregressive decoding phase for generative LLMs. To address this, we propose Jupiter, a fast, scalable, and resource-efficient collaborative edge AI system for generative LLM inference. Jupiter introduces a flexible pipelined architecture as a design principle and differentiates its system design according to the distinct characteristics of the prefill and decoding phases. For the prefill phase, Jupiter proposes a novel intra-sequence pipeline parallelism and develops a meticulous parallelism planning strategy to maximize resource efficiency; for the decoding phase, Jupiter devises an effective outline-based pipeline parallel decoding mechanism combined with speculative decoding, which further magnifies inference acceleration. Extensive evaluation based on a realistic implementation demonstrates that Jupiter remarkably outperforms state-of-the-art approaches under various edge environment setups, achieving up to 26.1x end-to-end latency reduction while rendering on-par generation quality.
INTRODUCTION
The emergence of generative large language models (LLMs) has attracted widespread attention from both industry and academia owing to their exceptional capabilities in a wide range of artificial intelligence (AI) tasks. These models, widely deployed in cloud datacenters equipped with powerful server-grade GPUs, have driven a growing range of intelligent edge applications such as ChatBot [1] and smart-home AI agents [2].
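The speculative decoding the Jupiter abstract combines with pipeline parallelism follows a generic draft-then-verify pattern: a cheap draft model proposes several tokens, and the target model accepts the longest agreeing prefix. A toy sketch of just that acceptance rule (not Jupiter's outline-based mechanism), where `verify_token` stands in for one target-model step:

```python
def speculative_accept(draft_tokens, verify_token):
    """Accept the longest prefix of draft tokens the target model agrees with.

    `verify_token(prefix)` is a stand-in for the target model: given the
    accepted prefix so far, it returns the token the target would emit next.
    On the first disagreement, the target's own token replaces the draft's.
    """
    accepted = []
    for t in draft_tokens:
        if verify_token(accepted) == t:
            accepted.append(t)        # draft token confirmed "for free"
        else:
            accepted.append(verify_token(accepted))  # target's correction
            break
    return accepted
```

Because agreeing draft tokens are confirmed in one verification pass rather than one autoregressive step each, the scheme accelerates decoding without changing the target model's output distribution in the exact-match variant shown here.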


CHESTNUT: A QoS Dataset for Mobile Edge Environments

Zou, Guobing, Zhao, Fei, Hu, Shengxiang

arXiv.org Artificial Intelligence

Quality of Service (QoS) is an important metric for measuring the performance of network services. Nowadays, it is widely used in mobile edge environments to evaluate the quality of service when mobile devices request services from edge servers. QoS usually involves multiple dimensions, such as bandwidth, latency, jitter, and packet loss rate. However, most existing QoS datasets, such as the common WS-Dream dataset, focus mainly on static QoS metrics of network services and ignore dynamic attributes such as time and geographic location. In other words, they do not record the mobile device's location at the time of the service request or the chronological order in which requests were made. Yet these dynamic attributes are crucial for understanding and predicting the actual performance of network services, as QoS typically fluctuates with time and geographic location. To this end, we propose a novel dataset that accurately records temporal and geographic location information on quality of service during the collection process, aiming to provide more accurate and reliable data to support future QoS prediction in mobile edge environments.
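The abstract's core point is that each QoS observation must carry a timestamp and a device location alongside the usual metrics. A minimal record sketch; the field names are illustrative assumptions, not the CHESTNUT dataset's actual schema:

```python
from dataclasses import dataclass

@dataclass
class QoSRecord:
    """One QoS observation, annotated with when and where it was collected."""
    user_id: str
    service_id: str
    timestamp: float      # seconds since epoch, enables chronological ordering
    latitude: float       # device location at request time
    longitude: float
    bandwidth_mbps: float
    latency_ms: float
    jitter_ms: float
    loss_rate: float      # packet loss fraction in [0, 1]

def order_by_time(records):
    """Sort observations chronologically, as temporal QoS prediction requires."""
    return sorted(records, key=lambda r: r.timestamp)
```

With time and location attached per record, a predictor can model how the same service's QoS drifts as a user moves, which purely static datasets like WS-Dream cannot support.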


Asteroid: Resource-Efficient Hybrid Pipeline Parallelism for Collaborative DNN Training on Heterogeneous Edge Devices

Ye, Shengyuan, Zeng, Liekang, Chu, Xiaowen, Xing, Guoliang, Chen, Xu

arXiv.org Artificial Intelligence

On-device Deep Neural Network (DNN) training has been recognized as crucial for privacy-preserving machine learning at the edge. However, the intensive training workload and limited onboard computing resources pose significant challenges to the availability and efficiency of model training. While existing works address these challenges through native resource management optimization, we instead leverage our observation that edge environments usually comprise a rich set of accompanying trusted edge devices with idle resources beyond a single terminal. We propose Asteroid, a distributed edge training system that breaks the resource walls across heterogeneous edge devices for efficient model training acceleration. Asteroid adopts hybrid pipeline parallelism to orchestrate distributed training, along with judicious parallelism planning to maximize throughput under given resource constraints. Furthermore, a fault-tolerant yet lightweight pipeline replay mechanism is developed to tame device-level dynamics for training robustness and performance stability. We implement Asteroid on heterogeneous edge devices with both vision and language models; our evaluations demonstrate up to 12.2x faster training than conventional parallelism methods and 2.1x faster training than state-of-the-art hybrid parallelism methods. Furthermore, Asteroid can recover the training pipeline 14x faster than baseline methods while preserving comparable throughput despite unexpected device exits and failures.
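Parallelism planning over heterogeneous devices amounts to partitioning the model's layers into contiguous pipeline stages so that no device becomes the bottleneck. A greedy sketch of that idea under stated assumptions (per-layer compute costs and per-device speeds are known); this is a generic heuristic, not Asteroid's actual planner:

```python
def plan_stages(layer_costs, device_speeds):
    """Partition contiguous layers into pipeline stages across heterogeneous devices.

    Greedy heuristic: each device receives a contiguous slice of layers whose
    total work approaches its speed-proportional share of the model, so that
    per-stage compute times (work / speed) stay roughly balanced.
    """
    total_work = sum(layer_costs)
    total_speed = sum(device_speeds)
    target = total_work / total_speed  # ideal work per unit of device speed
    stages, i = [], 0
    for d, speed in enumerate(device_speeds):
        stage = []
        budget = target * speed
        while i < len(layer_costs) and (not stage or sum(stage) + layer_costs[i] <= budget):
            stage.append(layer_costs[i])
            i += 1
        if d == len(device_speeds) - 1:  # last device absorbs any remainder
            stage += layer_costs[i:]
            i = len(layer_costs)
        stages.append(stage)
    return stages
```

Balancing stage times matters because pipeline throughput is limited by the slowest stage; a real planner would additionally account for inter-device bandwidth and memory limits.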


RED-CT: A Systems Design Methodology for Using LLM-labeled Data to Train and Deploy Edge Classifiers for Computational Social Science

Farr, David, Manzonelli, Nico, Cruickshank, Iain, West, Jevin

arXiv.org Artificial Intelligence

Large language models (LLMs) have enhanced our ability to rapidly analyze and classify unstructured natural language data. However, concerns regarding cost, network limitations, and security constraints have posed challenges for their integration into work processes. In this study, we adopt a systems design approach to employing LLMs as imperfect data annotators for downstream supervised learning tasks, introducing novel system intervention measures aimed at improving classification performance. Our methodology outperforms LLM-generated labels in seven of eight tests, demonstrating an effective strategy for incorporating LLMs into the design and deployment of specialized, supervised learning models present in many industry use cases.
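One common form of the system intervention the abstract describes is confidence-based routing: trust the LLM's label when it is confident, and escalate uncertain items. A hedged sketch of that pattern; the function names, the `(label, confidence)` interface, and the threshold are illustrative assumptions, not RED-CT's specific measures:

```python
def build_training_set(items, llm_label, human_label, threshold=0.8):
    """Assemble labels for a downstream classifier from an imperfect LLM annotator.

    `llm_label(text)` returns a (label, confidence) pair; items labeled with
    confidence below `threshold` are routed to `human_label` instead, trading
    a little annotation cost for cleaner supervised training data.
    """
    dataset = []
    for text in items:
        label, conf = llm_label(text)
        if conf < threshold:
            label = human_label(text)
        dataset.append((text, label))
    return dataset
```

The resulting dataset then trains a small, specialized edge classifier, avoiding the cost, network, and security constraints of calling the LLM at inference time.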


Galaxy: A Resource-Efficient Collaborative Edge AI System for In-situ Transformer Inference

Ye, Shengyuan, Du, Jiangsu, Zeng, Liekang, Ou, Wenzhong, Chu, Xiaowen, Lu, Yutong, Chen, Xu

arXiv.org Artificial Intelligence

Transformer-based models have unlocked a plethora of powerful intelligent applications at the edge, such as voice assistants in smart homes. Traditional deployment approaches offload the inference workloads to the remote cloud server, which would induce substantial pressure on the backbone network as well as raise users' privacy concerns. To address this, in-situ inference has recently been recognized as a path to edge intelligence, but it still confronts significant challenges stemming from the conflict between intensive workloads and limited on-device computing resources. In this paper, we leverage our observation that many edge environments usually comprise a rich set of accompanying trusted edge devices with idle resources and propose Galaxy, a collaborative edge AI system that breaks the resource walls across heterogeneous edge devices for efficient Transformer inference acceleration. Galaxy introduces a novel hybrid model parallelism to orchestrate collaborative inference, along with a heterogeneity-aware parallelism planning for fully exploiting the resource potential. Furthermore, Galaxy devises a tile-based fine-grained overlapping of communication and computation to mitigate the impact of tensor synchronizations on inference latency under bandwidth-constrained edge environments. Extensive evaluation based on a prototype implementation demonstrates that Galaxy remarkably outperforms state-of-the-art approaches under various edge environment setups, achieving up to 2.5x end-to-end latency reduction.
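Tile-based overlap works by splitting a tensor into tiles so the transfer of one tile can proceed while the next tile is being computed. A toy sketch that only simulates the overlap by the order of events (a real system would issue the sends asynchronously); names are illustrative, not Galaxy's API:

```python
def pipelined_tiles(tiles, compute, communicate):
    """Interleave per-tile computation with communication of the previous tile.

    Splitting a tensor into tiles lets the send of tile i-1 overlap the
    compute of tile i, hiding synchronization cost behind useful work.
    """
    events = []
    for i, tile in enumerate(tiles):
        if i > 0:
            events.append(communicate(tiles[i - 1]))  # would overlap compute of tile i
        events.append(compute(tile))
    events.append(communicate(tiles[-1]))  # flush the final tile
    return events
```

Under a bandwidth-constrained link, this fine-grained interleaving is what keeps tensor synchronization off the critical path, rather than computing the whole tensor and then paying the full transfer latency.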


ECLM: Efficient Edge-Cloud Collaborative Learning with Continuous Environment Adaptation

Zhuang, Yan, Zheng, Zhenzhe, Shao, Yunfeng, Li, Bingshuai, Wu, Fan, Chen, Guihai

arXiv.org Artificial Intelligence

Pervasive mobile AI applications primarily employ one of two learning paradigms: cloud-based learning (with powerful large models) or on-device learning (with lightweight small models). Despite their respective advantages, neither paradigm can effectively handle dynamic edge environments with frequent data distribution shifts and on-device resource fluctuations, inevitably suffering from performance degradation. In this paper, we propose ECLM, an edge-cloud collaborative learning framework for rapid model adaptation in dynamic edge environments. We first propose a novel block-level model decomposition design to decompose the original large cloud model into multiple combinable modules. By flexibly combining a subset of the modules, this design enables the derivation of compact, task-specific sub-models for heterogeneous edge devices from the large cloud model, and the seamless periodic integration of new knowledge learned on these devices back into the cloud model. As such, ECLM ensures that the cloud model always provides up-to-date sub-models for edge devices. We further propose an end-to-end learning framework that incorporates the modular model design into an efficient model adaptation pipeline, including an offline on-cloud model prototyping and training stage and an online edge-cloud collaborative adaptation stage. Extensive experiments over various datasets demonstrate that ECLM significantly improves model performance (e.g., 18.89% accuracy increase) and resource efficiency (e.g., 7.12x communication cost reduction) in adapting models to dynamic edge environments through efficient collaboration between the edge and cloud models.
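The two halves of ECLM's block-level design are deriving a sub-model from a subset of modules and folding device-side updates back into the cloud model. A minimal sketch treating the model as a name-to-block mapping; the representation and function names are illustrative assumptions:

```python
def derive_submodel(cloud_blocks, selected):
    """Compose a compact edge sub-model from a chosen subset of cloud-model blocks."""
    return {name: cloud_blocks[name] for name in selected}

def integrate_updates(cloud_blocks, edge_blocks):
    """Fold blocks updated on-device back into the cloud model (merged in place),
    so the cloud can keep serving up-to-date sub-models."""
    cloud_blocks.update(edge_blocks)
    return cloud_blocks
```

Because only the selected blocks travel in each direction, the derive/integrate cycle is what yields the communication savings the abstract reports, compared with shipping the full model.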


AdaptiveNet: Post-deployment Neural Architecture Adaptation for Diverse Edge Environments

Wen, Hao, Li, Yuanchun, Zhang, Zunshuai, Jiang, Shiqi, Ye, Xiaozhou, Ouyang, Ye, Zhang, Ya-Qin, Liu, Yunxin

arXiv.org Artificial Intelligence

Deep learning models are increasingly deployed to edge devices for real-time applications. To ensure stable service quality across diverse edge environments, it is highly desirable to generate tailored model architectures for different conditions. However, conventional pre-deployment model generation approaches are not satisfactory due to the difficulty of handling the diversity of edge environments and the demand for edge information. In this paper, we propose to adapt the model architecture after deployment in the target environment, where the model quality can be precisely measured and private edge data can be retained. To achieve efficient and effective edge model generation, we introduce a pretraining-assisted on-cloud model elastification method and an edge-friendly on-device architecture search method. Model elastification generates a high-quality search space of model architectures with the guidance of a developer-specified oracle model. Each subnet in the space is a valid model with a different environment affinity, and each device efficiently finds and maintains the most suitable subnet based on a series of edge-tailored optimizations. Extensive experiments on various edge devices demonstrate that our approach is able to achieve significantly better accuracy-latency tradeoffs (e.g., 46.74% higher accuracy on average with a 60% latency budget) than strong baselines with minimal overhead (13 GPU hours in the cloud and 2 minutes on the edge server).
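The on-device search step reduces to choosing, among the subnets in the elastified search space, the most accurate one that fits the device's latency budget, with latency measured in the target environment itself. A hedged sketch of that selection rule (a generic illustration, not AdaptiveNet's edge-tailored optimizations):

```python
def select_subnet(subnets, measure_latency, estimate_accuracy, budget_ms):
    """Pick the most accurate subnet that meets the device's latency budget.

    `measure_latency` is an on-device measurement in the target environment,
    so the choice reflects actual conditions rather than pre-deployment
    estimates; `estimate_accuracy` scores each candidate's quality.
    """
    feasible = [s for s in subnets if measure_latency(s) <= budget_ms]
    if not feasible:
        return min(subnets, key=measure_latency)  # fall back to the fastest subnet
    return max(feasible, key=estimate_accuracy)
```

Measuring latency after deployment is the crux of the paper's argument: the same subnet can land on different sides of the budget on different devices, which pre-deployment generation cannot anticipate.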


AI/ML at the Edge: 4 things CIOs should know

#artificialintelligence

And latency almost always matters when it comes to running artificial intelligence/machine learning (AI/ML) workloads. "Great AI requires a lot of data, and it demands it immediately." That's both the blessing and the curse in any sector – industrial and manufacturing are prominent examples, but the principle applies widely across businesses – that generates tons of machine data outside of its centralized clouds or data centers and wants to feed it to an ML model or other form of automation for any number of purposes. Whether you're working with IoT data on a factory floor, or medical diagnostic data in a healthcare facility – or one of many other scenarios where AI/ML use cases are rolling out – you probably can't do so optimally if you're trying to send everything (or close to it) on a round-trip from the edge to the cloud and back again. In fact, if you're dealing with huge volumes of data, your trip might never get off the ground. "I've seen situations in manufacturing facilities ...


Lenovo Harnesses AI at the Edge with ThinkEdge SE450 Launch

#artificialintelligence

The News: Lenovo Infrastructure Solutions Group (ISG) announced the expansion of the Lenovo ThinkEdge portfolio with the introduction of the new ThinkEdge SE450 server, designed to deliver an artificial intelligence (AI) platform directly at the edge to potentially accelerate business insights. The ThinkEdge SE450 could advance intelligent edge capabilities with AI-ready technology that provides faster insights and enhanced computing performance to more environments and accelerates real-time decision making at the edge. Analyst Take: I was interested to see Lenovo bolstering its overall AI edge proposition with the launch of its new ThinkEdge SE450 server product. The new offering is purpose-built for the edge, supporting the compact form factor, power efficiency optimization, ruggedized packaging, low acoustics, wireless connectivity, and remote management capabilities required to meet challenging edge environments. These environments include retail, manufacturing, smart city, and telecom settings, where factors such as harsh conditions, extreme temperatures, and security mandates dictate the AI-enabled, GPU-based design of the ThinkEdge SE450 server.